Abstract


Severe acute respiratory syndrome coronavirus (SARS-CoV) proteins belong to a large group of proteins that is difficult to express in traditional expression systems. The ability to express and purify SARS-CoV proteins in large quantities is critical for basic research and for development of pharmaceutical agents. The work reported here demonstrates: (1) fusion of SUMO (small ubiquitin-related modifier), a 100 amino acid polypeptide, to the N-termini of SARS-CoV proteins dramatically enhances expression in Escherichia coli cells and (2) 6x His-tagged SUMO-fusions facilitate rapid purification of the viral proteins on a large scale. We have exploited the natural chaperoning properties of SUMO to develop an expression system suitable for proteins that cannot be expressed by traditional methodologies. A unique feature of the system is the SUMO tag, which enhances expression, facilitates purification, and can be efficiently cleaved by a SUMO-specific protease to generate native protein with a desired N-terminus. We have purified various SARS-CoV proteins under either native or denaturing conditions. These purified proteins have been used to generate highly specific polyclonal antibodies. Our study suggests that the SUMO-fusion technology will be useful for enhancing expression and purification of the viral proteins for structural and functional studies as well as for therapeutic uses.
© 2005 Elsevier Inc. All rights reserved. 
Severe acute respiratory syndrome (SARS) is a respiratory illness that has only recently been reported in Asia, North America, and Europe. After the first case of the disease in humans was found in Southern China in late 2002, the outbreak spread quickly to about 35 countries on five continents, resulting in more than 8000 cases and 800 deaths. At present, there is no efficacious treatment regime for SARS. The need for both a reliable diagnostic assay and a therapeutic agent (antiviral or vaccine) is obvious. A previously unknown coronavirus has been identified as the causative agent of SARS. Scientists at the CDC and other laboratories determined the genomic sequence of this coronavirus and named it SARS-CoV [1-3].

Coronavirus, a genus within the family Coronaviridae, 
contains a group of large, positive stranded, enveloped, 
pathogenic RNA viruses that infect many species of ani- 
mals, including humans. They cause respiratory, enteric, 
and central nervous system diseases [4]. The genomic 
sequence of SARS-CoV provides important information 
for the development of diagnostic tests and vaccines. 
This information affords the opportunity to express any 
SARS-CoV protein of choice for recombinant subunit 
vaccines. Development of protein-based diagnostic and 
therapeutic methods would be greatly facilitated by the 
ability to produce viral proteins of high quality in tracta- 
ble amounts, which requires protein engineering, expres- 
sion, and purification. Six proteins of SARS-CoV, 
namely Spike (S), Nucleocapsid (Nc), Envelope (E), 
SARS polymerase (RdRp), SARS protease (3CL), and 
membrane (M), have become the focus of efforts to pro- 
duce antiviral agents and vaccines against SARS. The 
SARS-CoV proteins investigated in this study are 
described briefly below. 

SARS-CoV 3CL protease (3CL, 3CLpro or Mpro) is the principal coronavirus protease utilized by the virus to process its replicase proteins into mature forms. The full-length 3CL has 306 amino acids (molecular weight ~33.8 kDa). The protease cleaves the replicase polyproteins (pp1a and pp1ab) to generate RNA-dependent polymerase (RdRp), 3CL, and helicase, all crucial for viral replication [5-7]. Therefore, 3CL represents an attractive target for the design and discovery of coronavirus antiviral agents, as does the polymerase [8]. SARS-CoV Nucleocapsid protein (N or Nc) is a phosphoprotein containing 423 amino acids (molecular weight ~46 kDa) [9]. Large quantities of the protein are translated on free polysomes in the cytoplasm, where some molecules are rapidly phosphorylated. It is known that the protein binds the viral RNA and forms the nucleocapsid, but its exact mechanisms and role in replication are not yet clear. The Nc protein is known to have B and T cell epitopes and to elicit host protective immune responses [10,11]. Spike protein (S or Spk) is a glycoprotein containing 1255 amino acids [12]. Upon translation, it is inserted into the rough endoplasmic reticulum and glycosylated with N-linked glycans [13]. Some of the proteins accumulate in the Golgi apparatus, and a fraction of oligomeric spike protein is transported to the membrane, where it mediates cell-cell fusion. Like those of other coronaviruses, the SARS-CoV spike protein likely contains many of the neutralizing antibody epitopes as well as T cell epitopes [14].

A supply of purified SARS-CoV proteins would be 
valuable for both clinical and investigational purposes. 


Although several strategies have been developed over the years to express heterologous recombinant proteins in bacterial, yeast, mammalian, and insect cells, the expression of heterologous genes in bacteria is by far the simplest and least expensive means available for research or commercial purposes. However, heterologous gene products often fail to attain their correct three-dimensional (3-D) conformation, or are simply expressed very poorly in Escherichia coli. Selection of ORFs for structural genomics projects has shown that only ~20% of all heterologous genes expressed in E. coli render soluble or correctly folded proteins [15,16]. Several gene-fusion systems, such as NusA, maltose binding protein (MBP), glutathione-S-transferase (GST), ubiquitin (UB), and thioredoxin (Trx), have been developed [17,18]. All of these conventional methods have shortcomings, primarily inefficient expression and/or inconsistent cleavage.

Small ubiquitin-related modifier (SUMO) is a ubiquitin-related protein that functions by covalent attachment to other proteins. SUMO and its associated enzymes are present in all eukaryotes and are highly conserved from yeast to humans [19-21]. SUMO has 18% sequence identity with ubiquitin [22]. The yeast Saccharomyces cerevisiae has only a single SUMO gene (SMT3) that is essential for viability [20]. In contrast to yeast SMT3, three members of SUMO have been described in vertebrates: SUMO-1, SUMO-2, and SUMO-3. Human SUMO-1, a 101 amino acid polypeptide, shares 50% sequence identity with human SUMO-2/SUMO-3 [23], which are close homologues. Yeast SUMO shares 47% sequence identity with mammalian SUMO-1. Although overall sequence identity between ubiquitin and SUMO is only 18%, structure determination by NMR reveals that they share a common three-dimensional structure characterized by a tightly packed globular fold with β-sheets wrapped around a single α-helix [24,25]. It is known that SUMO, fused at the N-terminus with other proteins, can fold and protect the protein by its chaperoning properties, making it a useful tag for heterologous expression [26]. All SUMO genes encode precursor proteins with a short C-terminal sequence that extends from the conserved C-terminal Gly-Gly motif. SUMO proteases remove SUMO from proteins, by cleaving the C-termini of SUMO (-GGATY) in yeast to the mature form (-GG) or deconjugating it from lysine side chains [27,28]. The former activity (protease) is useful for removal of SUMO as an expression tag. There are 2 SUMO proteases in yeast [27,28] and at least 6 in humans, the human enzymes ranging from 238 to 1112 amino acid residues [22,29-31].

We have developed a novel SUMO-fusion system 
that provides increased levels of expression of heterolo- 
gous proteins in E. coli and allows rapid purification of 
proteins of interest [26,32]. We report here the applica- 
tion of SUMO-fusion technology to the expression and 
purification of major SARS-CoV proteins. 
Materials and methods


SARS-CoV 3CL protease (3CL), SARS-CoV Nucleocapsid (Nc), and SARS-CoV Spike C-terminal fragment protein (Spk C) were fused with SUMO and expressed in E. coli. For expression of the proteins, SARS-CoV cDNA was derived from infected cell RNA, provided by the CDC, Atlanta, to S.R.W. (University of Pennsylvania).


Construction of SUMO-SARS-CoV-fusion protein
expression vectors


Expression constructs encoding the SUMO-fusion proteins all utilized the pSUMO plasmid (LifeSensors, Malvern, PA) as the backbone. This pET24 derivative (pSUMO), which carries the SMT3 gene of S. cerevisiae encoding the yeast SUMO protein, has been described previously [26]. It contains an N-terminal hexahistidine (6x His) tag, introduced by PCR into the SUMO coding sequence, as well as a unique BsaI site at the C-terminus. The cloning strategy to express fusion proteins employed this BsaI site to insert the SARS-CoV protein coding sequences in frame with SUMO. PCR primers (Table 1) incorporating this site or Esp3I were used to amplify the SARS-CoV coding sequences from cDNA clones carried in pTOPO vectors. The 3' primers carried a BamHI site for insertion into the multiple cloning site of pET24d. The primer pairs used to PCR amplify the SARS-CoV protein genes are listed in Table 1. Because of its large size, Spike protein was designed as two half-molecules, the S1 (N-terminal fragment, amino acids 1-667, Spk N) and S2 (C-terminal fragment, amino acids 668-1193, Spk C) domains, and Spk C was tested for expression and purification in this study. For PCR amplification of the genes of interest, a proofreading polymerase was used (Platinum Taq, Invitrogen, Carlsbad, CA). PCR fragments were subcloned into pET24-6x His-SUMO or pET24-6x His (a parallel vector that does not carry the SUMO sequence) to produce parallel sets of constructs encoding 6x His-SUMO and 6x His fused versions of the proteins of interest. All plasmids were routinely sequenced.
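
The choice between BsaI and Esp3I at the 5' end comes down to whether the insert itself contains the enzyme's recognition site. A minimal sketch of that check follows. The recognition sequences GGTCTC (BsaI) and CGTCTC (Esp3I) are the standard ones for these enzymes; the function names and the toy sequences are hypothetical and are not taken from the study.

```python
# Recognition sequences for the two type IIS enzymes named in the text.
SITES = {"BsaI": "GGTCTC", "Esp3I": "CGTCTC"}

def revcomp(seq):
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def has_site(seq, site):
    """True if the site occurs on either strand of seq."""
    s = seq.upper()
    return site in s or revcomp(site) in s

def pick_enzyme(coding_seq, preferred=("BsaI", "Esp3I")):
    """Return the first enzyme whose site is absent from the insert, else None."""
    for name in preferred:
        if not has_site(coding_seq, SITES[name]):
            return name
    return None

# Toy inserts (hypothetical, for illustration only).
no_internal_site = "ATGAGTGGTTTTAGGAAAATGGCATTCCCG"
internal_bsai = "ATGGCAGGTCTCTTAGGA"

print(pick_enzyme(no_internal_site))  # insert is free of both sites
print(pick_enzyme(internal_bsai))     # internal BsaI site forces the alternative
```

An insert with an internal site would be cut during cloning, so the primer for that gene carries the alternative enzyme's site instead.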


Table 1
PCR primers for amplifying the SARS-CoV protein genes

Proteins                             Region of genes
Spike protein, C-terminal fragment   AA 668-1193
Nucleocapsid protein                 Entire gene
3CL protease                         Entire gene

Expression of SARS proteins using SUMO-fusion 


To test and compare expression of the SARS proteins, a single colony of the E. coli strain BL21 (DE3) containing each of the plasmids described above was inoculated into 5 ml of either Luria-Bertani (LB) or M9 minimal (MM) media. The antibiotic kanamycin was included at 30 µg/ml in all media. The cells were grown at 37°C overnight with shaking at 250 rpm. The next morning the overnight culture was transferred into 50 ml fresh medium to permit exponential growth. When the OD600 value reached ~0.6-0.7, protein expression was induced by addition of 1 mM IPTG (isopropyl-β-D-thiogalactopyranoside), followed by prolonged growth at either 37 or 20°C to determine optimal induction conditions. For protein purification, cultures were scaled up to 0.5-1.0 L LB medium.
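
The additions described above follow the usual dilution relation C1·V1 = C2·V2. The sketch below works out stock volumes for the stated final concentrations (1 mM IPTG, 30 µg/ml kanamycin) in a 50 ml culture. The stock concentrations (1 M IPTG, 30 mg/ml kanamycin) are assumptions for illustration; the text does not state them.

```python
def stock_volume_ul(final_conc, culture_ml, stock_conc):
    """Volume of stock (in microlitres) giving final_conc in culture_ml.

    final_conc and stock_conc must share the same units (C1*V1 = C2*V2).
    The small volume added is ignored in the total.
    """
    return final_conc * culture_ml * 1000.0 / stock_conc

# Final concentrations are from the text; stock concentrations are assumed.
iptg_ul = stock_volume_ul(final_conc=1.0, culture_ml=50.0, stock_conc=1000.0)   # mM
kan_ul = stock_volume_ul(final_conc=30.0, culture_ml=50.0, stock_conc=30000.0)  # ug/ml

print(f"IPTG stock to add: {iptg_ul:.1f} ul")
print(f"Kanamycin stock to add: {kan_ul:.1f} ul")
```

The same function scales to the 0.5-1.0 L purification cultures by changing `culture_ml`.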

Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) was used to verify expression of the protein. Briefly, 1.5 ml samples of culture were removed just before expression was induced and after induction, and cells were collected by centrifugation at 6000 rpm for 5 min. The cell pellets were suspended in 50 µl of distilled water, and the samples were freeze-thawed once to facilitate disruption of the cells. The cell suspensions were treated with RNase and DNase (both at 40 µg/ml) to digest nucleic acids. After mixing with SDS-PAGE sample buffer containing SDS and β-mercaptoethanol, samples were heated at 95°C for 5 min to facilitate denaturation and reduction of proteins. Proteins were detected using SDS-polyacrylamide gels with Tris-glycine running buffer and Coomassie blue staining.


Western blots 


Proteins separated by SDS-PAGE were transferred onto nitrocellulose membranes at 42 V (~150 mA) for 2.5 h. Membranes were then incubated with 30 ml of TTBS buffer (pH 8.0) containing 5% nonfat dry milk for 1 h at room temperature. The expressed proteins were probed with either monoclonal anti-His-tag or polyclonal antibodies obtained from rabbits immunized against individual SUMO-SARS-CoV-fusion proteins (Rockland Immunochemicals) by incubating overnight at 4°C with a 1:1000 dilution of the primary antibodies. After the membranes were washed with TTBS buffer for 5 min, they were incubated with a secondary antibody (peroxidase-conjugated goat anti-rabbit IgG, Rockland Immunochemicals, diluted 1000-fold) for 45 min. The membranes were finally washed with TTBS for 10 min before the chemiluminescent Western blot substrates were applied (Roche, Mannheim, Germany), and visualized on films (Kodak BioMax).


Table 1 (continued)
Primers (5' primer, then 3' primer, for each protein)

Spike protein, C-terminal fragment
tttGGTCTCaaggtatgagtactagccaaaaatctattgtggc
cccGGATCCtcatttaatatattgctcatattttc

Nucleocapsid protein
tttGGTCTCaaggtatgtctgataatggaccccaatc
cccGGATCCtcatgcctgagttgaatcagcag

3CL protease
tttCGTCTCaaggtagtggttttaggaaaatggcattcccg
cccGGATCCtcattggaaggtaacaccagagc

Restriction enzyme recognition sites used for cloning are indicated in uppercase letters. Owing to the presence of a BsaI site within the 3CL protease coding sequence, a different restriction enzyme, Esp3I, was used at the 5' end to join the BsaI site in the pET-SUMO vector.


Purification of SARS-CoV proteins 


Because the SUMO constructs bear an N-terminal 6x 
His tag, expressed SARS-CoV proteins fused with 
SUMO can be rapidly purified by Ni-NTA affinity 
chromatography. In this study, the soluble proteins from 
E. coli cell lysates and the insoluble proteins from the cell 
inclusion bodies were purified under native and denatur- 
ing conditions, respectively. A typical procedure for 
purification of the SARS-CoV proteins is illustrated in 
Fig. 1. Protein concentrations were determined using the 
Bradford color-reaction assay (Bio-Rad) measured spec- 
trophotometrically at 595 nm with bovine serum albumin 
as a standard, according to the manufacturer’s instruc- 
tions. SDS-PAGE and Coomassie blue staining were 
used to evaluate the effectiveness of the purifications and 
cleavage of SUMO-SARS-CoV protein fusions. 
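
Reading a protein concentration off a Bradford assay amounts to fitting a straight line to the BSA standards and inverting it for each sample. The sketch below illustrates that step only. The standard concentrations and A595 readings are invented for illustration, the assay is assumed to be in its linear range, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def concentration_from_a595(a595, slope, intercept, dilution_factor=1.0):
    """Invert the standard curve and correct for sample dilution."""
    return (a595 - intercept) / slope * dilution_factor

# Hypothetical BSA standards (ug/ml) and their A595 readings.
bsa_ug_per_ml = [0, 125, 250, 500, 750, 1000]
a595_readings = [0.00, 0.08, 0.16, 0.31, 0.46, 0.60]

slope, intercept = fit_line(bsa_ug_per_ml, a595_readings)
sample = concentration_from_a595(0.25, slope, intercept, dilution_factor=10)
print(f"slope={slope:.6f} per ug/ml, intercept={intercept:.4f}")
print(f"sample concentration: {sample:.0f} ug/ml")
```

Samples whose readings fall outside the range of the standards should be re-diluted rather than extrapolated.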


Culture E. coli to express SUMO-SARS-CoV proteins
Harvest cells
Extract cell soluble and insoluble proteins
Purify SUMO-fusion proteins with Ni-NTA column
Dialyze for removal of elution agents
Add SUMO protease to cleave SUMO-fusion proteins
Re-apply to Ni-NTA column to subtract SUMO and SUMO protease
Collect purified SARS-CoV proteins


Fig. 1. The procedure for purification of SARS-CoV proteins
expressed with SUMO-fusion system in E. coli and cleavage of 6x
His-SUMO-tagged proteins.


Preparation of soluble and insoluble protein samples from 
E. coli cells 

The E. coli cells expressing the SARS-CoV proteins were harvested from LB medium (typically, 1.0 L) by centrifugation (8000g for 10 min at 4°C). Typically, the wet weights of the E. coli cells harvested from 1 L culture were 10-15 g. The cell pellets were resuspended in lysis buffer (PBS containing additional 150 mM NaCl, 10 mM imidazole, 1% Triton X-100, and 1 mM PMSF, pH 8.0) at 3 ml per 1 g of cells, resulting in ~4 mg protein per ml after the proteins were extracted. The cells were lysed by sonication (50% output for 5 x 30 second pulses). Sonication was conducted with the tube jacketed in wet ice, observing 1 min intervals between pulse cycles to prevent heating. After the lysates were incubated with DNase and RNase (each at 40 µg/ml) for 20 min, they were centrifuged at 20,000g for 30 min at 4°C, and supernatants (soluble protein fractions) were collected. The pellets containing inclusion bodies were washed three times in buffer (PBS containing 25% sucrose, 5 mM EDTA, and 1% Triton X-100, pH 7.5) followed by centrifugation, as described above. The washed inclusion bodies were resuspended in denaturing solubilization buffer (Novagen), which contained 50 mM Caps (pH 11.0), 0.3% N-lauroyl sarcosine, and 1 mM DTT, and incubated for 30 min at room temperature with shaking to extract the insoluble proteins. Because debris from inclusion bodies was much smaller than that in the cell lysate, the extract for the insoluble proteins was obtained by high-speed centrifugation (80,000g for 30 min at 4°C).
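
The resuspension ratio and the approximate extract concentration quoted above give a quick estimate of lysate volume and total extracted protein for a given pellet. The sketch below applies those two figures (3 ml buffer per g wet cells, ~4 mg protein per ml). It is a rough planning aid under those stated values, not a prediction for any particular construct, and the function name is hypothetical.

```python
def lysis_plan(wet_weight_g, ml_per_g=3.0, protein_mg_per_ml=4.0):
    """Return (lysis buffer volume in ml, approximate extracted protein in mg)."""
    volume_ml = wet_weight_g * ml_per_g
    return volume_ml, volume_ml * protein_mg_per_ml

# Range of wet weights reported for 1 L cultures.
for grams in (10, 15):
    volume, protein = lysis_plan(grams)
    print(f"{grams} g cells: {volume:.0f} ml buffer, ~{protein:.0f} mg protein")
```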


Purification of 6x His-tagged SUMO-SARS-CoV 
proteins 

The soluble proteins extracted from E. coli cells were purified under native conditions, and a BioLogic DuoFlow FPLC system (Bio-Rad) was used for fractionations. Briefly, the cell lysate (typically, 20-40 ml containing 0.2-0.5 g proteins) was loaded onto a column containing ~10 ml Ni-NTA superflow resin (Qiagen, Valencia, CA) and the samples of flow-through containing unbound proteins were collected for subsequent analysis. The resin was extensively washed with ~50-100 ml of wash buffer (PBS containing 20 mM imidazole and additional 150 mM NaCl, pH 8.0) until OD280 reached or fell below the base line (UV value = 0). Finally, the 6x His-tagged SUMO-fusion proteins were eluted with elution buffer (PBS containing 300 mM imidazole and additional 150 mM NaCl, pH 8.0). The purified SUMO-fused proteins eluted as a single isolated UV peak. The proteins with high OD280 values were collected in 4 ml fractions that were checked on SDS-gels and pooled.

The insoluble proteins extracted from the E. coli inclusion bodies were purified under denaturing conditions, which were similar to the native conditions described above except for the use of a highly alkaline buffer containing detergent. Briefly, an insoluble protein sample (~20-40 ml) prepared in the denaturing buffer (50 mM Caps, 0.3% N-lauroyl sarcosine, and 1 mM DTT, pH 11.0) was incubated with ~10 ml of Ni-NTA superflow resin at 4°C for 1 h with shaking for effective binding of the 6x His-tagged proteins to the resin. The mixture was then loaded into an empty column and the flow-through sample was collected. Subsequently, the resin was continually washed with denaturing wash buffer that contained 20 mM imidazole, 0.3% N-lauroyl sarcosine, 0.3 M NaCl, and 50 mM Caps, pH 11, until OD280 fell below the base line. The 6x His-tagged SUMO-fusion proteins were finally eluted using denaturing elution buffer that contained the same components as the denaturing wash buffer, except that the concentration of imidazole was increased to 300 mM.


Cleavage of SUMO-fusion by the SUMO protease 

The SUMO protease used in this study was produced in our laboratory as described [26], and a unit of the protease activity was defined as the amount of SUMO protease that cleaves 100 µg of SUMO-Met-GFP-fusion substrate at 25°C in 1 h in buffer containing 20 mM Tris-HCl, pH 8.0, and 5 mM β-mercaptoethanol [26]. Before adding the enzyme for cleavage, the purified SUMO-fusion proteins (soluble fraction) were dialyzed with 3.5 kDa cutoff membranes against PBS (pH 7.4) for 12-15 h at 4°C to remove high salt and imidazole, while the purified samples in denaturing buffer were refolded by extensive dialysis for at least 24 h against 20 mM Tris-HCl (pH 8.0) containing 10% glycerol. No protein precipitation was observed during the dialysis. The minimum amount of SUMO protease required for complete cleavage of a given SUMO-fusion was variable. Typically, for most of the purified SUMO-SARS-CoV proteins we added the enzyme at a ratio of 1 U to 15 µg substrate and incubated in either PBS (pH 7.4) or 20 mM Tris buffer (pH 8.0), containing 5 mM β-mercaptoethanol, at 30°C for 1 h. In this study, cleavage of the SUMO-SARS-CoV Nc protein was achieved with a lower amount of the SUMO protease after checking effectiveness of the enzyme in serial dilution (see Fig. 8).
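
The working ratio of 1 U protease per 15 µg substrate translates directly into the number of units needed for a preparative batch. The sketch below does that arithmetic. The function name is hypothetical, and the 101 mg batch is used only as an example input (it is the amount of purified SUMO-3CL fusion listed in Table 2).

```python
def protease_units(substrate_mg, ug_per_unit=15.0):
    """Units of SUMO protease for a given mass of fusion protein.

    ug_per_unit is the mass of substrate handled per unit at the
    working ratio quoted in the text (1 U : 15 ug).
    """
    return substrate_mg * 1000.0 / ug_per_unit

batch_mg = 101.0  # example batch size
print(f"{batch_mg:.0f} mg fusion needs about {protease_units(batch_mg):.0f} U")
```

Because the minimum effective dose varies between fusions, a titration on a small aliquot can justify lowering `ug_per_unit` for a particular substrate.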


Removal of SUMO and SUMO protease for final 
purification of SARS-CoV proteins 

Since both SUMO and SUMO protease had 6x His tags, but SARS-CoV proteins did not, the cleaved SUMO-fusion samples could be re-applied to the nickel column to obtain the purified membrane proteins by subtracting the 6x His-tagged proteins. Briefly, after the SUMO-fusions were cleaved by the SUMO protease, the sample was loaded onto a nickel column with Ni-NTA resin. Most of the SARS-CoV protein without 6x His tags was eluted in the flow-through (unbound) fractions, and the rest was recovered by washing the resin with PBS. The eluted and washed proteins appearing in fractions with high UV values (OD280) were pooled as the final purified sample. The purified proteins were checked on SDS-gels and the samples were stored at -80°C after glycerol was added to 10%.


Results 


Enhanced expression of SARS-CoV proteins with 
SUMO-fusion 


SARS-CoV proteins 3CL, Nc, and Spike C, in ver- 
sions fused to either 6x His-SUMO or 6x His, were 
expressed in E. coli cells under various conditions. The 
expressed proteins were readily identified by their migra- 
tion positions in SDS-gels based on their molecular 
weights, and were further confirmed by immunological 
reactions with their respective antibodies on Western 
blots. 

The expressed SARS-CoV 3CL protease (3CL) was 
detected in lysates of E. coli cells under several culture 
and induction conditions (Fig. 2); induced cell lysate 
samples showed appropriate protein bands (approxi- 
mately 35kDa for 3CL and 47kDa for SUMO-3CL- 
fusion) on the SDS-gels (the sequence-predicted sizes of 
3CL and SUMO-3CL are 33.8 and 45.8kDa, respec- 
tively). When fused to the 3CL, SUMO significantly 
enhanced expression of its partner protein in both LB 
and MM media under all the conditions tested, com- 
pared to the 3CL expressed without SUMO-fusion (Fig. 
2). Overnight growth (~15 h) at 20°C resulted in an
increased yield of SUMO-fused 3CL compared to a 6 h
culture at the same temperature and a 3 h culture at
37°C (Fig. 2).

Expressed SARS-CoV Nucleocapsid (Nc) was 
detected in either unfused (~46kDa) or SUMO-fused 
(~60kDa) versions from IPTG-induced E. coli cells 
under various culture and induction conditions (Fig. 3). 
Notably, much higher yields of the expressed proteins 
were observed from rich medium (LB) than from mini- 
mal medium (MM) (Fig. 3), suggesting the former 
should be better for large-scale production and purification
of the proteins. Similar to the 3CL results, expression
enhancement was seen when Nc was fused to
SUMO and expressed in minimal medium, but in LB 
medium there were no significant differences between the 
expression of Nc without SUMO and Nc fused with 
SUMO (Fig. 3). 

The SUMO-fusion also greatly increased the level of expression of the C-terminal half of the SARS-CoV Spike protein (Spk C) compared to that of unfused Spk C in LB media (Fig. 4). Only a very weak protein band (~58 kDa) of unfused Spk C could be seen in the SDS-gel and no band was seen in the Western blot probed with anti-His-tag antibodies, indicating that Spk C was poorly expressed without SUMO-fusion under the conditions tested. In contrast, an intense protein band was observed at the SUMO-Spk C migration position (~68 kDa) on the SDS-gel (Fig. 4, left panel) when Spk C was fused with SUMO, and the identity of the fusion protein was confirmed by reactions with anti-His-tag antibody (Fig. 4, right panel).

Fig. 2. Enhanced expression of SARS-CoV 3CL protease (3CL) by SUMO-fusion in E. coli. Cells grown in either Luria-Bertani (LB) or M9 minimal (MM) medium were induced at the temperatures and for the lengths of time indicated. Just before expression was induced and after induction was completed the cells from a 1.5 ml aliquot of culture were lysed. Samples of whole cell lysates (~7.5 µl) from the various expression conditions were resolved in 12% SDS-gels and stained with Coomassie blue. Molecular weights were as indicated, and arrowheads highlight expected/observed positions of respective expressed protein bands.

Fig. 3. Enhanced expression of SARS-CoV Nucleocapsid protein (Nc) by SUMO-fusion in E. coli. The conditions for cell culture, protein expression, and gel detection were the same as in Fig. 2. The yields of expressed SUMO-Nc proteins were higher than the Nc expressed without SUMO in minimal media, but there were no significant differences in their expression in LB media.


Purification of SARS proteins 


Purification of SARS-CoV 3CL protease 

Fig. 5 shows detection of the proteins from a representative purification of soluble SARS-CoV 3CL under native conditions. The cell lysate containing soluble SUMO-3CL was used for this purification, because a majority of the expressed protein (>80%) was present in the soluble fraction (data not shown). Proteins without 6x His tags were removed from the Ni-NTA resin using wash buffer containing 20 mM imidazole, and the 6x His-tagged SUMO-3CL-fusion was eluted using elution buffer containing 300 mM imidazole. After the SUMO-3CL fractions were pooled, the sample was dialyzed extensively against PBS (pH 7.4) at 4°C to remove high salt and imidazole, which would interfere with the cleavage reaction. The SUMO-fusion was cleaved by addition of SUMO protease at 30°C for 1 h under the conditions described in Materials and methods. The completeness of cleavage was confirmed by checking the proteins on a 12% SDS-gel: the band of the SUMO-3CL disappeared and two new bands corresponding to the expected molecular weights of SUMO and 3CL were detected. After the cleaved sample was re-applied to a Ni-NTA column to subtract 6x His-tagged SUMO and SUMO protease, final purified 3CL was obtained (Fig. 5); the protein from the subtracted sample ran as a single, intense band (~34 kDa), indicating that 3CL had been purified successfully (>95% purity). In this experiment, a high yield (~56 mg in total) of the pure 3CL was achieved from 1 L of E. coli cultured and induced at 20°C overnight (Table 2). We used the anti-SUMO-3CL-fusion antibody to identify the purified 3CL protein, since the antibody could react with the SUMO-3CL-fusion and its cleaved partners. The purified protein was confirmed to be the SARS-CoV 3CL by the Western blot probed with the anti-SUMO-3CL antibody (see below and Fig. 7).

Fig. 4. Enhanced expression of SARS-CoV Spike C protein (Spk C) by SUMO-fusion in E. coli. The cells were cultured in LB media and the conditions for cell culture, protein expression, and gel detection were the same as in Fig. 2. The Western blot (right panel) was performed with anti-His-tag primary antibody.

Fig. 5. Detection of proteins in samples from various steps of a typical purification of SARS-CoV 3CL protease. Aliquots of the samples (each containing ~5 µg protein) were separated on a 12% SDS-gel and stained with Coomassie blue. The migration positions of the SUMO-fusion and the proteins resulting from the cleavage are as indicated.

Fig. 6. Detection of proteins from various steps of a typical purification of SARS-CoV Nucleocapsid proteins. Aliquots (~5 µg proteins) of the samples were resolved in a 12% SDS-gel and stained with Coomassie blue. The migration positions of molecular weight markers, SUMO-fusion, and proteins resulting from the cleavage are as indicated.

Table 2
Summary of the SARS-CoV proteins resulting from representative purifications of 1 L E. coli culture

Proteins                            3CL                         Nc                          Spk C
Starting samples for purification   Soluble fraction (224 mg)   Soluble fraction (189 mg)   Insoluble fraction (66 mg)
Purified SUMO-fusions               101 mg                      66 mg                       24 mg
Purified SARS-CoV proteins          56 mg                       26 mg                       12 mg
Purity                              >95%                        >95%                        ~30%

The SARS-CoV proteins fused with SUMO were expressed in E. coli and induced at 20°C overnight. The wet weights of the cells harvested from 1 L of E. coli culture for the 3CL, Nc, and Spk C were 14, 13, and 10 g, respectively. The samples were prepared and purified as described in Materials and methods.
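
The figures in Table 2 can be read as step recoveries. The sketch below tabulates, for each protein, the fraction of starting protein recovered as purified fusion and the fraction of fusion mass recovered as final protein. For 3CL it also compares the final yield with the maximum obtainable after tag removal, using the sequence-predicted sizes given in the Results (33.8 kDa for 3CL, 45.8 kDa for SUMO-3CL). The mass values come from Table 2; the function and variable names are hypothetical, and the comparison assumes mass scales with molecular weight.

```python
# Masses in mg per 1 L culture, as listed in Table 2.
TABLE_2_MG = {
    "3CL": {"start": 224, "fusion": 101, "final": 56},
    "Nc": {"start": 189, "fusion": 66, "final": 26},
    "Spk C": {"start": 66, "fusion": 24, "final": 12},
}

def fraction_of_maximum(fusion_mg, final_mg, protein_kda, fusion_kda):
    """Final yield relative to the mass expected if cleavage and
    subtraction were lossless (tag mass removed, nothing else lost)."""
    theoretical_mg = fusion_mg * protein_kda / fusion_kda
    return final_mg / theoretical_mg

for name, row in TABLE_2_MG.items():
    fusion_pct = 100.0 * row["fusion"] / row["start"]
    final_pct = 100.0 * row["final"] / row["fusion"]
    print(f"{name}: fusion = {fusion_pct:.0f}% of start, "
          f"final = {final_pct:.0f}% of fusion mass")

# 3CL only: predicted sizes for protein and fusion are stated in the text.
ratio_3cl = fraction_of_maximum(101, 56, protein_kda=33.8, fusion_kda=45.8)
print(f"3CL recovered after cleavage: {100.0 * ratio_3cl:.0f}% of theoretical maximum")
```

Part of the apparent loss between fusion and final protein is simply the mass of the removed tag, which is why the comparison against the theoretical maximum is the more informative number.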


Purification of SARS-CoV Nucleocapsid protein 

Similar to the SARS-CoV 3CL protease, most of the expressed SUMO-Nc protein was found in the soluble fraction from E. coli cells, and therefore the supernatant of the cell lysate was used for purification of the SARS-CoV Nc protein. The proteins resulting from various steps in the purification procedure were detected using SDS-PAGE (Fig. 6). Using Ni-NTA affinity to purify the 6x His-tagged SUMO-fusion was an efficient method, since only a single, high-density protein band was detected in the eluted fractions (Fig. 6). After the purified sample was dialyzed and the SUMO protease added under the conditions described above, complete cleavage of the fusion was achieved. A single, highly intense band (~46 kDa) was detected in the final purified sample, indicating that >95% pure SARS-CoV Nc was obtained (Fig. 6). In this experiment, approximately 26 mg of the Nc was purified from the 1 L E. coli culture (Table 2). The protein's identity was confirmed by its reaction with the anti-SUMO-Nc antibody (see Fig. 7).


Detection of the purified SARS-CoV 3CL and Nc proteins 
using Western blots 

Fig. 7 shows that the SUMO-3CL-fusion antibody reacted specifically with the purified 3CL, with a little cross-reactivity with Nc; likewise, SUMO-Nc-fusion antibody had a highly specific reaction to purified Nc, without any cross-reaction with 3CL. The results not only confirmed the identities of the SARS-CoV proteins but also suggested that the purified SARS proteins maintained their immunoreactive properties.

Fig. 7. Detection of reactions of the purified Nc and 3CL proteins with the respective SUMO-fusion antibodies. The purified SARS-CoV proteins were detected on the Coomassie blue stained 12% SDS-gel (A) and the Western blots were probed with the SUMO-3CL antibody (B) and with the SUMO-Nc antibody (C). The amounts of the purified proteins loaded for the SDS-gel and the Western blots were 4 and 2 µg, respectively. Arrowheads highlight observed positions of respective SARS protein bands.


Effects of variations in the amount of SUMO protease on
cleavage of SUMO-Nc-fusion proteins 

To evaluate the effectiveness of SUMO protease on the
cleavage of SUMO-SARS-CoV proteins, serial 1:1 dilutions
of the enzyme (starting at 2.0 U) were used to digest
aliquots (10 µg) of purified SUMO-Nc in PBS (pH 7.4)
containing 5 mM β-mercaptoethanol at 30 °C for 1 h
(Fig. 8). Since SUMO has a molecular
mass of 11.5 kDa (although it migrates at ~20 kDa in an
SDS-polyacrylamide gel), and the Nc band is ~46 kDa,
cleavage is judged to be successful if the protein band
representing the full-length substrate fusion (e.g., 20 + 46 = 66 kDa
in the case of SUMO-Nc) disappears and new bands
corresponding to the expected molecular weights of the
hydrolysis products are detected. Fig. 8 shows that as little
as 0.063 U of the enzyme cleaved >95% of 10 µg of
SUMO-Nc-fusion (lane 6) and 0.008 U cleaved ~50% of the
substrate (lane 9) under the tested conditions.

Fig. 8. Effect of different amounts of SUMO protease on cleavage of
SUMO-Nc-fusion. Serially diluted SUMO protease was added to
10 µg of purified SUMO-Nc-fusion protein and incubated in 30 µl PBS
containing 5 mM β-mercaptoethanol at 30 °C for 1 h. The protease
used was 20 U/µg. Aliquots (~12 µl) from the incubation mixture were
resolved on a 12% SDS-polyacrylamide gel and stained with Coomassie
blue. Lanes: 0, uncleaved fusion (control); 1, 2 U of the protease;
2, 1 U; 3, 0.5 U; 4, 0.25 U; 5, 0.125 U; 6, 0.063 U; 7, 0.032 U; 8,
0.016 U; 9, 0.008 U; 10, 0.004 U.
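
The arithmetic behind this titration can be sketched in a few lines of Python. This is only an illustration: it assumes that the "20 U/µg" given in the Fig. 8 legend is the specific activity of the protease preparation, and the function and variable names are hypothetical rather than taken from the original work. The legend reports the lane values rounded; the sketch keeps the exact two-fold values.

```python
from fractions import Fraction


def twofold_series(start_units, n_lanes):
    """Enzyme amounts (U) for serial 1:1 (two-fold) dilutions, lane 1 first."""
    return [Fraction(start_units) / 2**i for i in range(n_lanes)]


# Values taken from the text and the Fig. 8 legend.
START_UNITS = 2         # U of SUMO protease in lane 1
N_LANES = 10            # lanes 1-10
SUBSTRATE_UG = 10       # µg of SUMO-Nc per reaction
SPECIFIC_ACTIVITY = 20  # U per µg protease (assumed reading of "20 U/µg")

series = twofold_series(START_UNITS, N_LANES)

for lane, units in enumerate(series, start=1):
    enzyme_ug = units / SPECIFIC_ACTIVITY   # µg of protease in this lane
    ratio = SUBSTRATE_UG / enzyme_ug        # substrate:enzyme, by weight
    print(f"lane {lane:2d}: {float(units):.4f} U, "
          f"{float(enzyme_ug) * 1000:.3f} ng enzyme, "
          f"substrate:enzyme = {float(ratio):,.0f}:1 (w/w)")

# Apparent size of the uncleaved fusion on the gel, using the apparent
# (not the calculated) size of SUMO, as in the text.
SUMO_APPARENT_KDA = 20
NC_APPARENT_KDA = 46
fusion_apparent_kda = SUMO_APPARENT_KDA + NC_APPARENT_KDA
print(f"expected uncleaved SUMO-Nc band: ~{fusion_apparent_kda} kDa")
```

Under the stated assumption, lane 6 (the lowest amount giving >95% cleavage) corresponds to a substrate-to-enzyme weight ratio in the thousands, which is the sense in which the enzyme is described as potent.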


Purification of SARS-CoV Spike C protein 

When fused with SUMO, the C-terminal half of
SARS-CoV Spike protein (Spk C) was expressed at high
levels in E. coli (Fig. 9A). Because approximately 60% of
the total fusion protein expressed was in the bacterial
inclusion bodies, the insoluble protein sample extracted
from the inclusion bodies (Fig. 9A, lane 3) was used for
purification of the Spk C with Ni-NTA affinity
chromatography under denaturing conditions. Briefly,
the 6x His-tagged SUMO-Spike C-fusion was eluted with
elution buffer containing 300 mM imidazole, but a few
other minor proteins that lacked 6x His tags but were
possibly rich in histidine and/or cysteine also
bound to the resin, leaving impurities in the sample.
The unwanted proteins did not interfere with the cleavage
of the SUMO-fusion proteins, but reduced the
purity of the sample (Fig. 9B, lane 1). After the purified
SUMO-Spike C protein was extensively dialyzed, the
fusion was effectively cleaved by addition of SUMO protease
(>95% cleavage was achieved; see Fig. 9B, lane 2).
Finally, the 6x His-tagged SUMO and SUMO protease
were removed by applying the cleaved sample to the Ni-NTA
column to purify the Spk C. An SDS-gel of the
resulting sample showed unfused Spk C (~58 kDa)
along with three minor proteins (see Fig. 9B, lane 3),
indicating that partially purified Spk C was obtained.
Alternative purification approaches can be used after the
Ni-NTA purification to remove the impurities if >90%
purity is required. In this study, approximately 12 mg of
the partially purified Spk C sample was obtained from
1 L of E. coli culture (Table 2).

Fig. 9. Detection of SUMO-SARS-CoV Spike C proteins in
E. coli cells expressing the fusion (A) and various steps during a
representative purification of Spike C proteins (B). Aliquots (~5 µg
protein) of samples were resolved in 12% SDS-gels and stained with
Coomassie blue. Lanes in (A): 1, uninduced cells (control); 2, induced
cells; 3, insoluble fraction (the starting material for purification).
Lanes in (B): M, molecular weight markers; 1, purified SUMO-Spike
C-fusion; 2, cleaved SUMO-Spike C-fusion; 3, purified Spike C sample.
The migration positions of the respective proteins are as indicated.


Discussion 


At least six types of protein are encoded by the SARS 
coronavirus (SARS-CoV) genome. Large-scale produc- 
tion of these proteins in pure, functionally active form is 
critical to meet urgent needs in the development of diag- 
nostic and therapeutic methods for SARS, such as anti- 
viral drugs and vaccines, as well as for basic research 
purposes. Such a task is difficult using conventional 
expression systems. 

Several major protein fusion technologies have been 
developed to improve expression and purification of het- 
erologous recombinant proteins in bacterial, yeast, mam- 
malian, and insect cells. These include maltose binding 
protein (MBP), glutathione-S-transferase (GST), and thi- 
oredoxin (Trx) gene fusion systems [17,33]. However, 
many proteins are not expressed well with these fusion 
systems in commonly utilized hosts. Fusion of an unsta- 
ble or misfolded protein with proteins such as ubiquitin 
and ubiquitin-like proteins, which have a highly evolved 
structure, can stabilize the candidate protein. We have 
conducted a systematic comparison of the effectiveness of 
various fusion tags (MBP, GST, Trx, NusA, and SUMO) 
when used as GFP fusions expressed in E. coli, and have 
found SUMO to be superior to the other tags for expres- 
sion of the protein. 

GST and MBP domains have been used as tags to 
enhance production and purification of proteins of inter- 
est [33]. Problems are encountered, however, when these 
tags must be removed to study the protein’s structure by 
X-ray crystallography or NMR. Although several prote- 
ases such as thrombin, Factor Xa, and AcTEV protease 
are used for these purposes, all of these enzymes recog- 
nize short degenerate sequences, and, thus, cleavage can 
occur within the proteins of interest. Another problem 
encountered is inaccessibility of the cleavage site within 
the fusion due to steric constraints, which could reduce 
the effectiveness of enzymic cleavage. The SUMO tag, by 
contrast, is accurately and efficiently removed from the 
protein of interest [26]. Comparing the cleavage of 
SUMO-GFP by SUMO protease to the cleavage of 
NusA-GFP by AcTEV protease, we found that SUMO 
protease had a 64-fold higher activity than AcTEV pro- 
tease when the same amount of enzyme was used 
(unpublished results).

Ubiquitin has been reported to exert chaperoning
effects on fused proteins, thus increasing expression of 
proteins in E. coli and yeast [34-36]. The fused proteins 
can be cleaved by Ub-proteases (both UCH and UBP 
classes), but the enzymes are unstable, difficult to pro- 
duce, and often must be used in large quantities (an 
enzyme to substrate ratio of 1:1), making this technology 
impractical for large-scale protein production [18]. Our 
laboratory has exploited the chaperoning properties of 
several ubiquitin-like proteins including SUMO, and the 
extreme robustness of SUMO protease has allowed us to 
develop a technology that provides both enhanced 
expression and cleavage of the fusion protein. A number 
of difficult proteins have been expressed in our labora- 
tory in both unfused and SUMO-fused forms and com- 
pared side-by-side to demonstrate that SUMO-fusion 
dramatically enhances the expression of many types of 
proteins, including membrane proteins, and that SUMO 
protease cleaves a variety of SUMO-fusions with high 
specificity and efficiency over a wide range of
conditions, including pH (5.5-10.5) and temperature
(4-37 °C) [26]. Non-specific cleavage of the substrate was
not observed, even when the amount of enzyme was 
deliberately increased to a 1:1 ratio [26]. In this study, 
titration of the hydrolytic capacity of SUMO protease 
on the purified SARS-CoV Nc proteins confirmed that
SUMO protease is an extremely potent enzyme (Fig. 8). 
The predicted molecular weight of the SUMO protease 
is 26.7 kDa, though it usually runs at the ~31 kDa position
on an SDS-polyacrylamide gel.

After evaluating and comparing various SARS-CoV 
proteins expressed with or without SUMO-fusion in E. 
coli under several culture and induction conditions, we 
found that SUMO-fusion significantly increased expres- 
sion of SARS proteins under nearly all conditions tested. 
We established a batch production protocol employing 
20°C overnight growth for large-scale expression of 
SUMO-SARS-CoV proteins, since a shorter time (e.g., 
6h) or higher temperature (37 °C) resulted in lower yields, 
especially for soluble proteins (data not shown). Although 
in most cases, cells growing in rich medium (LB) pro- 
duced more SUMO-fused protein than cells growing in 
minimal medium (MM), we will use MM to investigate 
secreted SARS-CoV proteins in future studies, since rich 
medium contains a large number of interfering proteins. 

In addition to producing SARS-CoV proteins in large 
quantities for basic research and for development of 
anti-SARS pharmaceutical agents, it is important to
produce pure proteins that retain biological activity. The 
expressed and purified SARS-CoV proteins had immunological
activity, but their functional activities remain to be
established. It appears, at least in the case
of one SARS-CoV protein, that SUMO-enhanced 
expression and purification from E. coli results in active 
protein. In a study to be published, SUMO-fusion 
enhanced expression of the SARS-CoV RNA-dependent
RNA polymerase (RdRp), and the purified soluble
RdRp was biologically active (unpublished results). 
Finally, we recently observed that SUMO-fusion 
significantly enhanced expression and purification of 
SARS-CoV membrane protein (M) as well. Using the 
SUMO-fusion technology described here, the expression 
level of SARS-CoV M protein in E. coli was greatly 
improved, and the insoluble proteins extracted from the 
bacterial inclusion bodies were purified [32]. Application
of the various purified SARS-CoV proteins to the development
of SARS vaccines and functional assays is
underway.